ABSTRACT

A two phase sampling method and the
optimal sampling segment dimensions are
developed for the estimation of the sugar cane
cultivated area. This technique employs visual
interpretations of LANDSAT images and
panchromatic aerial photographs considered as the
ground truth. The estimates, as a mean value
of 100 simulated samples, represent 99.3% of
the true value with a CV of approximately 1%;
the relative efficiency of the two phase design
was 157% when compared with a one phase aerial
photograph sample.


1. INTRODUCTION

In this paper a statistical system to estimate the sugar cane (Saccharum
spp.) cultivated area in a subregion of São Paulo State, Brazil, is presented.
The region under study is known as the Great Region of Jaú, located in the
central part of the state, between parallels 22°00' and 23°00' South and
meridians 48°00' and 49°00' West, covering 5046 km².

For this region, the sugar cane acreage was determined by two different
approaches. In the first approach, Koffler et al. (1980) interpreted
panchromatic aerial photographs at scales of 1:35000 and 1:45000,
complemented with a rigorous field control. This information is considered as
the ground truth.

In the second approach, Mendonca et al. (1981) used LANDSAT images at a
scale of 1:250000 and visual interpretation, associating the spectral
variations of the crop with temporal variations across different satellite
passes. Channels 5 and 7 were used.

With this basic information, a sampling experiment was designed to
estimate the sugar cane acreage and measure the error rate in relation to the
ground truth. The applied sampling method was a two phase sampling with
regression estimates.


* Presented at the Sixteenth International Symposium on Remote Sensing of
  Environment, Buenos Aires, Argentina, June 2-9, 1982.
2. STUDY OBJECTIVES

a) To determine the segment size that gives the maximum correlation
   between LANDSAT data and ground truth.

b) To estimate the sugar cane acreage with a sampling design using
   LANDSAT data and ground truth and considering the cost of each source
   of data.

3. SEGMENT SIZE AND SAMPLING METHOD
a) Determination of segment size

With a grid of size 1 cm by 1 cm (corresponding to 2.5 km by 2.5 km)
applied on a composition of LANDSAT images and on a mosaic of aerial
photographs, the number of points with sugar cane in each unitary segment was
determined.

From that information, a subarea of 60 km x 37.5 km was selected and
a uniformity trial (Federer, 1967) was designed. The response variable was the
number of hectares cultivated with sugar cane by segment of 6.25 km².

Ten different types of segments were considered:

Type    Size                                   No. of segments

1x1      2.5 x  2.5 km  =    6.25 km²          360
1x2      2.5 x  5.0 km  =   12.50 km²          168
1x3      2.5 x  7.5 km  =   18.75 km²          120
1x4      2.5 x 10.0 km  =   25.00 km²           72
2x2      5.0 x  5.0 km  =   25.00 km²           84
2x3      5.0 x  7.5 km  =   37.50 km²           36
2x4      5.0 x 10.0 km  =   50.00 km²           36
3x3      7.5 x  7.5 km  =   56.25 km²           40
3x4      7.5 x 10.0 km  =   75.00 km²           24
4x4     10.0 x 10.0 km  =  100.00 km²           18


The criterion of selection was the maximum correlation between the
variables "response to the LANDSAT images interpretation", X, and "ground
truth", Y.
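The selection criterion can be illustrated with a minimal sketch. It assumes the 60 km x 37.5 km subarea is stored as a 24 x 15 grid of unit segments and that each segment type tiles the grid without overlap, dropping partial blocks at the edges; under that assumption the block counts come out as, for example, 168 for type 1x2 and 18 for type 4x4. The per-cell values below are synthetic, and the names `aggregate`, `pearson`, `truth` and `landsat` are hypothetical, not taken from the study.

```python
import math
import random


def aggregate(grid, rows, cols):
    """Sum unit cells into non-overlapping rows x cols blocks.

    Cells left over at the edges (partial blocks) are dropped.
    """
    n_block_rows = len(grid) // rows
    n_block_cols = len(grid[0]) // cols
    totals = []
    for i in range(n_block_rows):
        for j in range(n_block_cols):
            totals.append(sum(
                grid[i * rows + di][j * cols + dj]
                for di in range(rows)
                for dj in range(cols)
            ))
    return totals


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Synthetic stand-in data (NOT the study data): hectares of sugar cane per
# 6.25 km² unit segment, 24 x 15 cells at 2.5 km per cell.
random.seed(0)
truth = [[random.uniform(0.0, 625.0) for _ in range(15)] for _ in range(24)]
landsat = [[max(0.0, v + random.gauss(0.0, 150.0)) for v in row] for row in truth]

shapes = [(1, 1), (1, 2), (1, 3), (1, 4), (2, 2),
          (2, 3), (2, 4), (3, 3), (3, 4), (4, 4)]
correlations = {
    shape: pearson(aggregate(landsat, *shape), aggregate(truth, *shape))
    for shape in shapes
}
# The criterion: keep the segment type with the largest correlation.
best_shape = max(correlations, key=correlations.get)

for shape in shapes:
    print(shape, round(correlations[shape], 3))
print("selected:", best_shape)
```

With real interpretation data in place of the synthetic grids, the same loop would rank the ten segment types by the correlation between X and Y.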

b) Sampling design 

A two phase sampling design with regression estimates was applied,
combining the two sources of information.

This design takes advantage of the correlation between X and Y and of the
cost ratio for collecting the data on each variable. The sample selection was
done in two steps. In the first step, a relatively large sample was selected
on the less expensive variable, X in this case. In the second step, a small
sample was selected on the variable that is more expensive to observe, Y.
This bivariate information permits the calibration of the information from X.

When the correlation between X and Y is sufficiently large, the reduction in
the estimator variance and in the sample size is significant in comparison with
a single phase sampling design on Y alone.
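As a minimal sketch of the estimator described above, the following code computes the double-sampling regression estimate of the mean of Y per segment, together with the usual large-population variance approximation found in sampling textbooks such as Cochran. The function name and the small data set are hypothetical illustrations, not the computations used in the study.

```python
def two_phase_regression_estimate(x_first, x_second, y_second):
    """Double-sampling regression estimate of the mean of Y.

    x_first  : X observed on all k first-phase segments
    x_second : X observed on the n second-phase segments (a subsample)
    y_second : Y observed on the same n second-phase segments
    Returns (estimate of mean Y, regression slope b, approximate variance).
    """
    k, n = len(x_first), len(x_second)
    x_bar_k = sum(x_first) / k
    x_bar_n = sum(x_second) / n
    y_bar_n = sum(y_second) / n

    sxy = sum((x - x_bar_n) * (y - y_bar_n) for x, y in zip(x_second, y_second))
    sxx = sum((x - x_bar_n) ** 2 for x in x_second)
    syy = sum((y - y_bar_n) ** 2 for y in y_second)

    b = sxy / sxx  # least-squares slope of Y on X in the second phase
    y_reg = y_bar_n + b * (x_bar_k - x_bar_n)

    # Variance approximation, ignoring the finite population correction:
    # residual variance / n + (total variance - residual variance) / k
    s2_y = syy / (n - 1)
    s2_res = (syy - b * sxy) / (n - 2)
    variance = s2_res / n + (s2_y - s2_res) / k
    return y_reg, b, variance


# Hypothetical toy data with an exact linear relation Y = 2X + 1.
x_first = [10, 20, 30, 40, 50, 60]
x_second = [10, 30, 50]
y_second = [21, 61, 101]
y_reg, b, variance = two_phase_regression_estimate(x_first, x_second, y_second)
print(y_reg, b, variance)
```

The second-phase mean of Y is adjusted by the slope times the difference between the first-phase and second-phase means of X, which is how the cheap variable calibrates the expensive one.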
Before applying the sampling design, the region of study was redefined in
order to eliminate parts of it where sugar cane is not cultivated, and also to
eliminate some geographical accidents as well as some incomplete marginal
segments.

The theoretical considerations of the sampling method applied in this
study are developed in Cochran (1963), Loetsch and Haller (1973) and Jessen
(1978), among other authors.

4. ANALYSIS AND ESTIMATION 

The sample size in the two phases was set to achieve a sugar cane acreage
estimate within 5% of the corresponding complete acreage evaluation with a
95% confidence level at a minimum cost.

To calculate k and n, the number of segments in the first and second
phase, respectively, the values of σ_y and ρ calculated from the complete
enumeration of available ground truth data were used. When this information
is not available, it is necessary to select a pilot sample.
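The paper does not spell out the formulas used for k and n. As a hedged illustration, the textbook allocation for double sampling with regression (ignoring the finite population correction, with a linear cost C = c1·k + c2·n) gives the cost-optimal ratio n/k as sqrt((1 - ρ²)/ρ² · c1/c2). The sketch below evaluates that ratio and the cost function with the values reported in Section 5 (ρ = .82, c1:c2 = 1:13, k = 58, n = 11); the function names are hypothetical.

```python
import math


def optimal_ratio(rho, c1, c2):
    """Cost-optimal n/k for double sampling with regression.

    Textbook result, ignoring the finite population correction and
    assuming the linear cost function C = c1 * k + c2 * n.
    """
    return math.sqrt((1.0 - rho ** 2) / rho ** 2 * c1 / c2)


def total_cost(k, n, c1, c2):
    """Linear cost of k first-phase and n second-phase observations."""
    return c1 * k + c2 * n


ratio = optimal_ratio(0.82, 1.0, 13.0)
print(round(ratio, 4))              # theoretical n/k, about 0.19
print(round(11 / 58, 4))            # ratio of the sample sizes reported in the paper
print(total_cost(58, 11, 1, 13))    # 201 monetary units
```

The reported sample sizes give n/k close to the theoretical ratio, and their cost under the 1:13 ratio equals the reported 201 monetary units. The absolute sizes also depend on the precision target, which this sketch does not attempt to reproduce.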

With the obtained values of k and n, a sequence of one hundred two phase
samples was simulated by simple random selection in each phase, the second
phase sample being a simple random subsample of the first one.

The simulation produced a sequence of values of the random variables Ŷ_R,
estimate of the sugar cane acreage; v(Ŷ_R), variance of the estimate; and
of D = Ŷ_R - a, where a = 133888 ha is the total number of hectares cultivated
with sugar cane from the ground truth complete enumeration. This value is
assumed to be without error.

Finally, the simulated data were analysed statistically. 
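The simulation procedure can be sketched as follows. The frame size (86 segments) and the sample sizes (k = 58, n = 11) are those reported in Section 5, but the population itself is synthetic: the X values, the noise level and the random seed are assumptions made only so that the example runs, and the function name `regression_total` is hypothetical.

```python
import random
import statistics

N_SEGMENTS, K_FIRST, N_SECOND, N_REPLICATES = 86, 58, 11, 100


def regression_total(x_first, x_second, y_second, n_population):
    """Two phase regression estimate of the population total of Y."""
    x_bar_k = sum(x_first) / len(x_first)
    x_bar_n = sum(x_second) / len(x_second)
    y_bar_n = sum(y_second) / len(y_second)
    sxy = sum((x - x_bar_n) * (y - y_bar_n) for x, y in zip(x_second, y_second))
    sxx = sum((x - x_bar_n) ** 2 for x in x_second)
    b = sxy / sxx
    return n_population * (y_bar_n + b * (x_bar_k - x_bar_n))


# Synthetic population (NOT the study data): X in ha per segment, Y linear
# in X plus noise. Slope and intercept borrow the reported regression line.
random.seed(1982)
x_pop = [random.uniform(0.0, 3000.0) for _ in range(N_SEGMENTS)]
y_pop = [442.0 + 0.69 * x + random.gauss(0.0, 300.0) for x in x_pop]
true_total = sum(y_pop)

estimates = []
for _ in range(N_REPLICATES):
    # Phase one: simple random sample of k segments, X observed.
    first = random.sample(range(N_SEGMENTS), K_FIRST)
    # Phase two: simple random subsample of n segments, Y also observed.
    second = random.sample(first, N_SECOND)
    estimates.append(regression_total(
        [x_pop[i] for i in first],
        [x_pop[i] for i in second],
        [y_pop[i] for i in second],
        N_SEGMENTS,
    ))

mean_estimate = statistics.mean(estimates)
sd_estimate = statistics.stdev(estimates)
mean_difference = mean_estimate - true_total
print(round(mean_estimate), round(sd_estimate), round(mean_difference))
```

The summary quantities correspond to the mean of the estimates, their standard deviation and the mean difference from the assumed-true total, which are the statistics analysed in the next section.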

5. RESULTS

a) The selected segment size in the uniformity trial was the one with
   dimension 2x3, corresponding to 5.0 km x 7.5 km = 37.50 km² or
   3750 ha. The maximum correlation coefficient between X and Y was
   r = .82 for this size, being r = .66 and .73 for segments of size
   6.25 km² and 100.00 km² respectively.

b) The sampling frame, after being redefined (3 b), had 86 segments with
   an area of 3082 km², 16% of them incomplete, but with an area greater
   than one half of the complete one.

c) The standard deviation of Y was σ_y = 777.46 ha and the correlation
   coefficient ρ_xy = .82. The cost ratio considered, c1:c2, was 1:13,
   where c1 and c2 are the unit costs of observing one Xi (in phase one)
   and one Yi (in phase two), respectively. The sample sizes were
   k = 58 and n = 11 for a relative variance of 5% with a confidence
   level of 95% and a minimum cost of C = 201 monetary units. The
   regression equation of ground truth on LANDSAT data was Y = 442 + .69X.

d) The results from the analysis of the simulated data were M(Ŷ_R) =
   132889 ha (99.3% of the ground truth value). SD(Ŷ_R) = 1013 ha
   (SD% = 1%). Range(Ŷ_R) = max Ŷ_R - min Ŷ_R = 29393 ha. Mean
   difference D = M(Ŷ_R) - a = -999 ha. The confidence interval for the
   sugar cane acreage at the 95% level was (130963; 134915) =
   (97.7%; 100.1%), and for the mean difference at the 95% level was
   (-3025; 1027).
e) The relative efficiency of the two phase design with respect to a
   single phase design was 157%, which means a gain of 57% when LANDSAT
   data and ground truth data are jointly used in a regression estimator.
   Finally, the total acreage estimate from LANDSAT data only, with a
   sample of size k = 58, was 140266 ha (104.8% of the ground truth value).
   The 95% confidence interval was (138924; 141608) = (103.8%; 105.8%).
   The regression estimate has a tendency to underestimate the true
   value, at 99.3% of it, while the estimate using only LANDSAT data
   represents 104.8%. The estimation with aerial photograph data (ground
   truth data in this study) is not being taken into consideration
   because of its high collecting cost. This is the main reason for
   implementing a sampling design that minimizes the use of that type
   of information.
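The paper does not state how the relative efficiency was computed. As a hedged cross-check, the textbook comparison at equal total cost (optimal double sampling with regression against a single phase sample on Y alone, ignoring the finite population correction) is c2 / (sqrt((1 - ρ²)·c2) + ρ·sqrt(c1))². The sketch below evaluates it with the reported ρ = .82 and cost ratio 1:13; the function name is hypothetical.

```python
import math


def relative_efficiency(rho, c1, c2):
    """Variance of a single phase sample on Y divided by the variance of
    the optimal two phase regression estimate, at equal total cost.

    Textbook approximation, ignoring the finite population correction.
    """
    denominator = (math.sqrt((1.0 - rho ** 2) * c2) + rho * math.sqrt(c1)) ** 2
    return c2 / denominator


efficiency = relative_efficiency(0.82, 1.0, 13.0)
print(round(100 * efficiency, 1))  # in percent, roughly 156
```

Under these assumptions the formula gives roughly 156%, which is close to the 157% reported. The small gap may come from rounding of ρ or from a different variance expression in the original computation.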

6. CONCLUSIONS 

The sampling design applied in this study shows benefits, in cost and
precision, relative to a complete inventory carried out with a more expensive
method of data collection such as aerial photography.

The authors understand that these results are limited to a small region
like the one selected for this study. Besides that, much work has yet to be
done in this sampling approach.